Can AI Read Between The Lines? Benchmarking LLMs On Financial Nuance
Kubica, Dominick, Gordon, Dylan T., Emura, Nanami, Saini, Derleen, Goldenberg, Charlie
As of 2025, Generative Artificial Intelligence (GenAI) has become a central tool for productivity across industries. Beyond text generation, GenAI now plays a critical role in coding, data analysis, and research workflows. As large language models (LLMs) continue to evolve, it is essential to assess the reliability and accuracy of their outputs, especially in specialized, high-stakes domains like finance. Most modern LLMs transform text into numerical vectors, which are used in operations such as cosine similarity searches to generate responses. However, this abstraction process can lead to misinterpretation of emotional tone, particularly in nuanced financial contexts. While LLMs generally excel at identifying sentiment in everyday language, these models often struggle with the nuanced, strategically ambiguous language found in earnings call transcripts. Financial disclosures frequently embed sentiment in hedged statements, forward-looking language, and industry-specific jargon, making it difficult even for human analysts to interpret consistently, let alone AI models. This paper presents findings from the Santa Clara Microsoft Practicum Project, led by Professor Charlie Goldenberg, which benchmarks the performance of Microsoft's Copilot, OpenAI's ChatGPT, Google's Gemini, and traditional machine learning models for sentiment analysis of financial text. Using Microsoft earnings call transcripts, the analysis assesses how well LLM-derived sentiment correlates with market sentiment and stock movements and evaluates the accuracy of model outputs. Prompt engineering techniques are also examined to improve sentiment analysis results. Visualizations of sentiment consistency are developed to evaluate alignment between tone and stock performance, with sentiment trends analyzed across Microsoft's lines of business to determine which segments exert the greatest influence.
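The cosine-similarity operation mentioned above can be sketched in a few lines. This is a minimal, self-contained illustration with toy three-dimensional vectors standing in for real LLM embeddings (which typically have hundreds or thousands of dimensions); the phrase labels are hypothetical, not drawn from the study's data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for two hedged earnings-call phrases (illustrative only).
v_guidance = [0.9, 0.1, 0.3]
v_caution = [0.8, 0.2, 0.4]
print(round(cosine_similarity(v_guidance, v_caution), 3))
```

Because hedged financial phrases often land close together in embedding space, two statements with opposite market implications can still score a high similarity, which is one way tone gets misread.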
ChatBCG: Can AI Read Your Slide Deck?
Singh, Nikita, Balian, Rob, Martinelli, Lukas
With the advanced vision capabilities of GPT-4o and Gemini Flash, an important question arises about how accurate these capabilities are in practical business applications. Our assumption was that multimodal models are good at reading and summarizing charts: given an image of a slide deck, they do a good job of summarizing its key insights, often including relevant data points. Existing research has evaluated the efficacy of LLMs when parsing tables [3], concluding that LLMs are highly sensitive to the input prompts that drive performance. Other works evaluate LLMs' ability to read and reason about mathematical graphs [2] and find that GPT models outperform alternatives. This paper explores whether multimodal models perform well on a variant of this skill: answering straightforward questions that require the model to pick out a number from a slide deck.
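Scoring this kind of task reduces to checking whether the number in a model's free-text answer matches the ground truth from the slide. A minimal sketch of such an exact-match scorer is below; the sample answers and question style are hypothetical, not taken from the paper's benchmark.

```python
import re

def extract_number(answer):
    """Pull the first numeric value out of a model's free-text answer."""
    m = re.search(r"-?\d+(?:\.\d+)?", answer.replace(",", ""))
    return float(m.group()) if m else None

def exact_match_accuracy(answers, ground_truth):
    """Fraction of answers whose extracted number equals the ground truth."""
    hits = sum(1 for a, g in zip(answers, ground_truth)
               if extract_number(a) == g)
    return hits / len(ground_truth)

# Hypothetical model answers to "What was Q3 revenue (in $M)?"-style questions.
answers = ["Revenue was $1,250M in Q3.", "Roughly 340", "The deck does not say."]
truth = [1250.0, 340.0, 87.0]
print(exact_match_accuracy(answers, truth))  # 2 of 3 answers match
```

Stripping thousands separators before matching avoids penalizing a model for formatting rather than for reading the slide wrong.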
Watch This AI Read and Explain Computer Code
Under the hood, Gemini is an artificial neural network trained on almost 1 million examples of code-description pairs. Specifically, Gemini is a language model built on the transformer architecture. Our dataset consists of examples in Python, JavaScript, Java, Go, PHP, and Ruby; in other words, Gemini can understand a wide array of programming languages. Because the training dataset spans such a broad range of programming examples, we theorized that Gemini would be highly robust across different programming scenarios.
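To make "code-description pair" concrete, here is one hypothetical training example in the style of public code-search datasets such as CodeSearchNet; this is an assumed format for illustration, not Gemini's actual training schema.

```python
# A hypothetical code-description pair (assumed format, illustration only).
pair = {
    "language": "python",
    "code": "def is_even(n):\n    return n % 2 == 0",
    "description": "Return True if the integer n is even, else False.",
}

# A model trained on such pairs learns to map the code field to the
# description field; here we just sanity-check the pair itself by
# executing the stored source and calling the resulting function.
namespace = {}
exec(pair["code"], namespace)
print(namespace["is_even"](4))  # True
print(namespace["is_even"](7))  # False
```

Pairing many such snippets with natural-language summaries, across several languages, is what lets a transformer model describe code it has never seen.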
Would You Let An AI Read Your Mind?
The 38-year-old Finnish science fiction author, along with his data scientist friend Samuel Halliday, got his hands on a simple wearable brain scanner and started wondering how he could use the technology to tell more engaging stories. So in 2012, they came up with a story that could be read while wearing the wireless headset and that would branch and change depending on whether the reader showed more affinity for life or death imagery. Think of it as a modern version of the text-only interactive games of the late '70s, or a Choose Your Own Adventure eBook, but one where your brain's electrical activity determines the choices. The project has been open-sourced to encourage innovation, meaning that with a $400 piece of hardware and some machine learning and writing skills, anyone can venture into the depths of the design space created by emerging brain-computer interface technologies. While there is a lot of fuss these days around whether we can make artificial intelligence (AI) truly intelligent, giving 'brains' to machines might not always be enough.